Two Stage GP-UCB for Open Loop Grasping
Abstract
In this project I attempt to learn open-loop grasping parameters by implementing a two-stage bandit algorithm based on the Gaussian Process Upper Confidence Bound (GP-UCB). Over the course of the project I developed parametrized grasping strategies, grasp verification using soft fingers via tandem grasp, a bandit algorithm that learns the parameters from a minimal number of robot interactions, and a rudimentary block detection algorithm.
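For orientation, the following is a minimal sketch of ordinary single-stage GP-UCB on a one-dimensional grid of candidate parameters. It is not the project's implementation: the abstract does not say how the two stages are arranged, so none of that is modelled here. The reward function `toy_grasp_score`, the candidate grid, the kernel length scale, the noise level, and the exploration weight `beta` are all hypothetical values chosen only for illustration.

```python
import math


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two scalar parameters."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def gp_posterior(xs, ys, x_star, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_star."""
    if not xs:
        return 0.0, 1.0  # prior: zero mean, unit variance
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_star) for a in xs]
    alpha = solve(K, ys)    # K^{-1} y
    v = solve(K, k_star)    # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(1.0 - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, math.sqrt(var)


def gp_ucb(reward, candidates, rounds=15, beta=4.0):
    """Repeatedly try the candidate maximising mean + sqrt(beta) * std."""
    xs, ys = [], []
    for _ in range(rounds):
        def ucb(x):
            mean, std = gp_posterior(xs, ys, x)
            return mean + math.sqrt(beta) * std
        x_next = max(candidates, key=ucb)
        xs.append(x_next)
        ys.append(reward(x_next))  # one (simulated) robot interaction
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], xs, ys


def toy_grasp_score(x):
    """Hypothetical stand-in for a measured grasp quality, peaked at 0.6."""
    return math.exp(-((x - 0.6) / 0.15) ** 2)


candidates = [i / 20 for i in range(21)]
best_x, xs, ys = gp_ucb(toy_grasp_score, candidates)
print(best_x, max(ys))
```

Each call to `reward` stands for one physical grasp attempt, so the `rounds` argument is the interaction budget. A larger `beta` spends more of that budget on exploration.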
Similar resources
Time-Varying Gaussian Process Bandit Optimization
We consider the sequential Bayesian optimization problem with bandit feedback, adopting a formulation that allows for the reward function to vary with time. We model the reward function using a Gaussian process whose evolution obeys a simple Markov model. We introduce two natural extensions of the classical Gaussian process upper confidence bound (GP-UCB) algorit...
Distributed Batch Gaussian Process Optimization
This paper presents a novel distributed batch Gaussian process upper confidence bound (DB-GP-UCB) algorithm for performing batch Bayesian optimization (BO) of highly complex, costly-to-evaluate black-box objective functions. In contrast to existing batch BO algorithms, DB-GP-UCB can jointly optimize a batch of inputs (as opposed to selecting the inputs of a batch one at a time) while still prese...
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multiarmed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP...
Optimization as Estimation with Gaussian Processes in Bandit Settings (Supplement)
In this supplement, we provide proofs for all theorems and lemmas in the main paper, more exhaustive experimental results and details on the experiments. 1 Proofs. 1.1 Proofs from Section 2. Lemma 2.1. In any round $t$, the point selected by EST is the same as the point selected by a variant of GP-UCB with $\lambda_t = \min_{x \in X} \frac{\hat{m}_t - \mu_{t-1}(x)}{\sigma_{t-1}(x)}$. Conversely, the candidate selected by GP-UCB is the same as...
Optimization as Estimation with Gaussian Processes in Bandit Settings
Recently, there has been rising interest in Bayesian optimization – the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior. We study an optimization strategy that directly uses an estimate of the argmax of the function. This strategy offers both practical and theoretical advantages: no tradeoff parameter needs to be selected, and, moreover, w...
Publication date: 2015